A versatile parallel block-tridiagonal solver for spectral codes
Abstract
This paper introduces a three-dimensional (3-D) processor configuration for a parallel solver of massive block-tridiagonal matrix systems. The added parallelization dimension delays the saturation of scaling caused by communication overhead and inefficient parallelization. A semi-empirical formula for the matrix operation count of typical parallel algorithms is estimated, including the saturation effect in the 3-D processor grid. The most suitable algorithm, a combination of “Divide-and-Conquer” and “Cyclic Odd-Even Reduction”, is implemented in TORIC, an MPI-Fortran90 numerical code. Using thousands of processors, the new 3-D parallel solver of TORIC runs about 4 times faster at the optimized 3-D grid than the old 2-D parallel solver under the same conditions.
Similar resources
A block-tridiagonal solver with two-level parallelization for finite element-spectral codes
Two-level parallelization is introduced to solve a massive block-tridiagonal matrix system. One level distributes the blocks, whose size is as large as the number of block rows due to the spectral basis, and the other level parallelizes in the block row dimension. The purpose of the added parallelization dimension is to delay the saturation of the scaling due to communicat...
GPGPU parallel algorithms for structured-grid CFD codes
A new high-performance general-purpose graphics processing unit (GPGPU) computational fluid dynamics (CFD) library is introduced for use with structured-grid CFD algorithms. A novel set of parallel tridiagonal matrix solvers, implemented in CUDA, is included for use with structured-grid CFD algorithms. The solver library supports both scalar and block-tridiagonal matrices suitable for approxima...
BCYCLIC: A parallel block tridiagonal matrix cyclic solver
A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved us...
Alternating-Direction Line-Relaxation Methods on Multicomputers
We study the multicomputer performance of a three-dimensional Navier-Stokes solver based on alternating-direction line-relaxation methods. We compare several multicomputer implementations, each of which combines a particular line-relaxation method and a particular distributed block-tridiagonal solver. In our experiments, the problem size was determined by resolution requirements of the applicat...
Implementation of a Fully-Balanced Periodic Tridiagonal Solver on a Parallel
While parallel computers offer significant computational performance, it is generally necessary to evaluate several programming strategies. Two programming strategies for a fairly common problem (a periodic tridiagonal solver) are developed and evaluated. Simple model calculations as well as timing results are presented to evaluate these strategies. The particular tridiagonal solver evaluated is us...
Publication date: 2010